{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Analysis Study Notes\n",
    "\n",
    "## Task 2: Paper Author Statistics\n",
    "\n",
    "<img src=\"图片/作者统计.png\" >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import the required packages\n",
    "import seaborn as sns  # plotting\n",
    "from bs4 import BeautifulSoup  # scraping arXiv data\n",
    "import re  # regular expressions for matching string patterns\n",
    "import requests  # sending network requests to fetch information from a URL\n",
    "import json  # reading the data, which is stored in JSON format\n",
    "import pandas as pd  # data processing and analysis\n",
    "import matplotlib.pyplot as plt  # plotting"
   ]
  },
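  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, a quick review of the escape sequences used in Python strings:\n",
    "\n",
    "| Escape sequence | Meaning |\n",
    "|--|--|\n",
    "| `\\` (at end of line) | Line continuation |\n",
    "| `\\\\` | Backslash |\n",
    "| `\\'` | Single quote |\n",
    "| `\\\"` | Double quote |\n",
    "| `\\n` | Newline |\n",
    "| `\\t` | Horizontal tab |\n",
    "| `\\r` | Carriage return |\n",
    "\n",
    "And a few string methods:\n",
    "\n",
    "| Method | Description |\n",
    "|--|--|\n",
    "| `string.capitalize()` | Uppercases the first character of the string and lowercases the rest |\n",
    "| `string.isalpha()` | Returns True if the string has at least one character and all characters are letters, otherwise False |\n",
    "| `string.title()` | Returns a title-cased copy: every word starts with an uppercase letter and the remaining letters are lowercase (see `istitle()`) |\n",
    "| `string.upper()` | Converts the lowercase letters in the string to uppercase |\n"
   ]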
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = 'Hello world'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Hello world'"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Uppercase the first character of the string\n",
    "a.capitalize()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# True only if the string is non-empty and every character is a letter;\n",
    "# the space in 'Hello world' makes the result False\n",
    "a.isalpha()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'HELLO WORLD'"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert the lowercase letters in the string to uppercase\n",
    "a.upper()"
   ]
  },
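  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The string methods listed earlier can also be compared side by side on a couple of sample strings. This is only a small illustrative sketch: the variables `s` and `word` are made up for the example and are not used anywhere else in these notes.\n",
    "\n",
    "```python\n",
    "s = 'hello world'  # hypothetical sample string containing a space\n",
    "word = 'hello'     # hypothetical sample string with letters only\n",
    "\n",
    "print(s.capitalize())  # first character uppercased, the rest lowercased\n",
    "print(s.title())       # every word starts with an uppercase letter\n",
    "print(s.upper())       # all letters uppercased\n",
    "print(s.isalpha())     # False: the space is not a letter\n",
    "print(word.isalpha())  # True: the string contains only letters\n",
    "```\n",
    "\n",
    "Note that `isalpha()` is False for `s` only because of the space; each word on its own passes the check."
   ]
  },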
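  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next comes the data processing.\n",
    "\n",
    "To keep the data manageable, only three fields are read: `authors`, `categories`, and `authors_parsed` (the author names split into parts).\n",
    "\n",
    "The core of this step is a for loop that reads the file line by line.\n",
    "\n",
    "`enumerate()` wraps an iterable (such as a list, tuple, or string) and yields pairs of index and item, so both the position and the value are available inside the loop.\n",
    "\n",
    "One might try to do the same thing with pandas:\n",
    "\n",
    "`pd.read_json(\"arxiv-metadata-oai-snapshot.json\",usecols=['abstract','categories','comments'])`\n",
    "\n",
    "However, this does not work: `read_json` does not support a `usecols` parameter.\n",
    "\n",
    "For details, see this issue in the pandas GitHub repository:\n",
    "\n",
    "https://github.com/pandas-dev/pandas/issues/19821\n"
   ]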
  },
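  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since `read_json` cannot select columns, one way to keep only a few fields is to filter each record while looping. The sketch below uses only the standard library; `read_fields`, its `limit` argument, and the two sample records are made up for illustration and are not part of the original notebook or of pandas.\n",
    "\n",
    "```python\n",
    "import io\n",
    "import json\n",
    "\n",
    "def read_fields(lines, fields, limit=None):\n",
    "    # Yield one dict per JSON line, keeping only the requested fields.\n",
    "    # enumerate() supplies the line index, used here to stop after `limit` lines.\n",
    "    for idx, line in enumerate(lines):\n",
    "        if limit is not None and idx >= limit:\n",
    "            break\n",
    "        record = json.loads(line)\n",
    "        yield {key: record[key] for key in fields}\n",
    "\n",
    "# Two made-up records standing in for the real snapshot file\n",
    "records = [\n",
    "    {'id': '1', 'authors': 'A. Smith', 'categories': 'cs.CV'},\n",
    "    {'id': '2', 'authors': 'B. Lee', 'categories': 'math.CO'},\n",
    "]\n",
    "sample = io.StringIO('\\n'.join(json.dumps(r) for r in records))\n",
    "\n",
    "rows = list(read_fields(sample, ['authors', 'categories'], limit=1))\n",
    "print(rows)\n",
    "```\n",
    "\n",
    "With the real data, an open file handle can be passed in place of `sample`, and the resulting list of dicts can be handed to `pd.DataFrame`."
   ]
  },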
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = []\n",
    "# Each line of the file is parsed as one JSON record\n",
    "with open(\"arxiv-metadata-oai-snapshot.json\", 'r') as f:\n",
    "    for idx, line in enumerate(f):\n",
    "        d = json.loads(line)\n",
    "        # Keep only the three fields needed for this task\n",
    "        d = {'authors': d['authors'], 'categories': d['categories'], 'authors_parsed': d['authors_parsed']}\n",
    "        data.append(d)\n",
    "\n",
    "data = pd.DataFrame(data)"
   ]
  },
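  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we will compute the following statistics:\n",
    "\n",
    "the Top 10 most frequent full author names;  \n",
    "the Top 10 most frequent author surnames (the last word of the name);  \n",
    "the frequency of the first character of author surnames.\n",
    "\n",
    "The filtering here uses the `apply` function.\n",
    "\n",
    "A quick review of last month's pandas material:\n",
    "\n",
    "`apply` accepts either a custom function or a lambda function.\n",
    "\n",
    "The return value can be a scalar, a Series, or a DataFrame."
   ]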
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Select the papers whose categories include cs.CV\n",
    "data2 = data[data['categories'].apply(lambda x: 'cs.CV' in x)]\n",
    "\n",
    "# Concatenate the per-paper author lists into one list\n",
    "all_authors = sum(data2['authors_parsed'], [])"
   ]
  },
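  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Concatenating lists with `sum(..., [])` builds a new list at every step, which can become slow when there are many papers. A possible alternative, shown here as a sketch on made-up data, is `itertools.chain.from_iterable`, which produces the same flat list in a single pass. The variable `authors_per_paper` is a hypothetical stand-in for `data2['authors_parsed']`.\n",
    "\n",
    "```python\n",
    "from itertools import chain\n",
    "\n",
    "# Made-up stand-in: one list of authors per paper,\n",
    "# each author stored as a list of name parts\n",
    "authors_per_paper = [\n",
    "    [['Smith', 'Anna', ''], ['Lee', 'Bo', '']],\n",
    "    [['Smith', 'Carl', 'Jr']],\n",
    "]\n",
    "\n",
    "flat_sum = sum(authors_per_paper, [])                       # approach used in the notebook\n",
    "flat_chain = list(chain.from_iterable(authors_per_paper))   # single-pass alternative\n",
    "\n",
    "print(flat_chain == flat_sum)\n",
    "print(len(flat_chain))\n",
    "```\n",
    "\n",
    "Both approaches keep the original order of the authors, so either result can be used for the counting steps."
   ]
  },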
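  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After this step, `all_authors` is a single list in which each element is one author's name, stored as a list of name parts. We start with the frequency of full names.\n",
    "\n",
    "`authors_names[0].value_counts().head(10).plot(kind='barh')`\n",
    "\n",
    "\n",
    "Functions used for plotting:\n",
    "\n",
    "`value_counts()` counts occurrences and sorts them in descending order; `head(10)` keeps the first ten; `kind='barh'` draws a horizontal bar chart.\n",
    "\n",
    "`_ = plt.yticks(range(0, len(names)), names)`\n",
    "sets the tick positions and labels on the y-axis: the positions run from 0 to `len(names) - 1`, and the labels are the author names."
   ]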
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 0, 'Count')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAArIAAAFzCAYAAADCJeoMAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAqoklEQVR4nO3df7hcVX3v8ffHJCYEJFiD3Ig/jtIoDUQQIhVEBbVajV60okBtBbVGr4rFyq1Uq2JtbdBqQSn2Cb0Ktf6CAv4Ar1CVKIIQEggkQUBviVUKKoIRREDi9/4xOzAez++cc+bsc96v5znP7Fmz91rfme3gJ2vW7ElVIUmSJLXNQ3pdgCRJkjQWBllJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKBllJkiS10uxeF6Dxt3Dhwurr6+t1GZIkScNat27dbVW161iONchOQ319faxdu7bXZUiSJA0ryffHeqxLCyRJktRKBllJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKBllJkiS1kpffmoY23LyFvhMu6HUZkjSozSuX97oESdOAM7KSJElqJYOsJEmSWskgK0mSpFYyyEqSJKmVpnWQTbI1yfquv74JHGt1kmXN9peT7LI9ffRrX5bkI+NQpiRJ0rQx3a9a8Muq2neyB62qF45zf2uBtePZpyRJUttN6xnZgSTZP8k3kqxLcmGSRU376iQnJVmT5MYkz2ja5yc5K8l1Sc5LcsVAs6b9xticZGGz/flmrE1JVjRts5KckWRjkg1J3tp1+MsHqOGQJOdPyAsiSZLUUtN9RnaHJOub7ZuAVwAfBQ6rqp8kOQL4O+A1zT6zq+qAJC8E3gM8F3gjcEdVLUmyN7Ce0XlNVd2eZAfgyiTnAH3A7lW1N0C/ZQgD1TCsJiSvAJi1866jLFGSJKl9pnuQ/Y2lBU0Q3Rv4jyQAs4BbuvY/t7ldRydsAhwMnAJQVRuTXDvKGt6S5KXN9mOAxcANwBOSfBS4ALhomBqGVVWrgFUAcxctrlHWKEmS1DrTPcj2F2BTVR04yOP3NrdbGYfXJskhdGZUD6yqu5OsBuZV1R1J9gGeD7yBzkzxtlnhca1BkiRpupppa2RvAHZNciBAkjlJ9hrmmEvpBE2SLAGWjmK8BXSWJdydZE/gaU0/C4GHVNU5wF8D+43uaUiSJGlGzfhV1X1JDgc+kmQBned/MrBpiMNOA85Mch1wfbPvlgH2m82Ds6nbfAV4Q5Lv0AnRlzftuwOfSLLtHxJ/NYanI0mSNKOlyuWUQ0kyC5hTVfck2QP4KvCkqrqva5+5wPeAvatqoJA7qeYuWlyLjj6512VI0qA2r1ze6xIkTRFJ1lXVkFeEGsyMmpEdo/nAxUnm0Flj+8Z+IXYZ8EngtKkQYiVJkmYKg+wwqupOYNB/JTQ/VvB7k1eRJEmSwCA7LS3dfQFr/dhOkiRNczPtqgWSJEmaJgyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklppdq8L0PjbcPMW+k64oNdlSNKYbF65vNclSGoJZ2QlSZLUSgZZSZIktZJBVpIkSa1kkJUkSVIrTUiQTce3krygq+3lSb4yAWMdk6SSPLer7SVN2+HN/X9JsmScxjsxyfH92jYnWTge/UuSJGlkJiTIVlUBbwA+nGRekp2A9wNvmojxgA3AkV33jwKu6arnz6rqugkaW5IkST0wYUsLqmoj8CXg7cC7gX8D/i3J1UkuS/IkeGBG9dwkX0ny3SQf2NZHktcmuTHJmiSnJzl1kOEuAQ5IMqcJzb8LrO/qZ3WSZc32XUn+Lsk1SS5PslvTvkdzf0OSv01y12ifc5Idk1zQ9L0xyRFN+wMztkmWJVndbB+Q5NujeU0kSZLUMdFrZN8L/DHw
AuBk4BlV9RQ6wfb9XfvtCxwBLAWOSPKYJI8C3gU8DXg6sOcQ4xTwVeD5wGHAF4fYd0fg8qraB/gm8Lqm/RTglKpaCvxwmOf11iTrt/0Bj2ra/xD476rap6r2BoZbSnE9o3hNhulLkiRpRpnQH0Soql8k+RxwF7AzcEaSxXSC55yuXb9WVVsAklwHPA5YCHyjqm5v2s8GnjjEcJ8F3gIsAN4GvGOQ/e4Dzm+21wF/0GwfCLyk2f408A9DjPWPVfXA40k2N5sbgA8lOQk4v6ouGaIPmlrPHMVr8oPBOkqyAlgBMGvnXYcZVpIkqf0m46oFv27+3gdc3MxUvhiY17XPvV3bWxlDwK6qNXRmLxdW1Y1D7PqrZg3viMZqliFsm3kdroYbgf3oBNq/TfLu5qH7efC17n7e4/aaVNWqqlpWVctmzV8wXKmSJEmtN5mX31oA3NxsHzOC/a8EnpXk4UlmAy8bwTEnMPhM7HAu7xrjgS+OVdU7q2rfqtp3uA6a5RB3V9W/AR+kE2oBNgP7N9vdz2O0r4kkSZIakxlkPwD8fZKrGcGMa1XdTGfN6BrgUjphcMswx/zfqrp4jPUdB/xFkmvpfFlsyLEGsRRY08zevgf426b9vcApSdbSmV3dZlSviSRJkh6UBz9ln3qS7FRVdzUzsucBH6+q8yZorPnAL6uqkhwJHFVVh03EWBNt7qLFtejok3tdhiSNyeaVy3tdgqRJlGRdVS0by7FTfRbwxOaHDuYBFwGfn8Cx9gdOTRLgZ8BrJnAsSZIkbacpHWSr6vjh9xq3sS4B9pms8SRJkrR9JnONrCRJkjRupvSMrMZm6e4LWOsaM0mSNM05IytJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaqXZvS5A42/DzVvoO+GCXpchSdtl88rlvS5B0hTnjKwkSZJaySArSZKkVjLISpIkqZUMspIkSWqlngTZJP+Y5Liu+xcm+Zeu+x9K8hcTNPasJOuSPLOr7aIkL2+2v5xkl4kYW5IkSeOnVzOylwIHASR5CLAQ2Kvr8YOAyyZi4KraCrwRODXJnCRHAb+uqrObx19YVT+biLElSZI0fnoVZC8DDmy29wI2AncmeXiSucDvAVcleXeSK5NsTLIqSQCSrE5yUpI1SW5M8oymfX6Ss5Jcl+S8JFckWdZ/8Kq6Avg2cCLwfuDN2x5LsjnJwmb7883s7aYkK7r2OSrJhqauk7raX9vUsybJ6UlObdpf3NRydZKvJtmtaT8xyceb5/OfSd7S1de7ktyQ5FtJPpPk+HF43SVJkqaNnlxHtqr+O8n9SR5LZ/b128DudMLtFmBDVd2X5NSq+huAJJ8EXgR8qelmdlUdkOSFwHuA59KZab2jqpYk2RtYP0QZfwX8ADi5qr43yD6vqarbk+wAXJnkHGAucBKwP3AHcFGSlwBrgHcB+wF3Al8Hrmn6+RbwtKqqJH8G/CXwtuaxPYFDgYcBNyT5GLAv8DJgH2AOcBWwbojnQhO0VwDM2nnXoXaVJEmaFnr5gwiX0QmxBwEfphNkD6ITZC9t9jk0yV8C84HfATbxYJA9t7ldB/Q12wcDpwBU1cYk1w4x/jObsfYeYp+3JHlps/0YYDGwG7C6qn4CkORTTV8A36iq25v2s4EnNu2PBj6XZBHwUOCmrjEuqKp7gXuT/Ljp/+nAF6rqHuCeJF9iGFW1ClgFMHfR4hpuf0mSpLbr5VULtq2TXUpnacHldGZkDwIuSzIPOA04vKqWAqcD87qOv7e53cooA3mSHYEPAM8GHtnM6vbf5xA6s7wHVtU+wNX9xh+NjwKnNs/j9Qz8PGAMz0WSJGmm6mWQvYzOUoHbq2prM5O5C50wexkPhr3bkuwEHD6CPi8FXgGQZAmdkDyQdwNnVdX1dJYj
/GMTnLstoLNM4e4kewJPa9rXAM9KsjDJLOAo4BvAlU37w5PMprM0oLuvm5vto0f4PF6cZF7z3F80gmMkSZJmlF7O/m2gc7WCT/dr26mqbgNIcjqd2dpb6QTF4ZwGnJnkOuB6OksRtnTvkGQv4KV01p9SVVcnuRB4O/Derl2/ArwhyXeAG+jMGFNVtyQ5AbgYCJ2lAV9o+n4/naB7ezP+trFPBM5OcgedtbOPH+pJVNWVSb4IXAv8qHldtgx1jCRJ0kyTqumznLKZIZ1TVfck2QP4KvCkqrpvksbfqaruamZkzwM+XlXnbWdf84FvAiuq6qqRHDt30eJadPTJYxlWkqaMzSuX97oESZMgybqq+q2rTI3EdFuPOR+4OMkcOrOlb5ysENs4Mclz6SyLuAj4/Hb0tapZHjEPOHOkIVaSJGmmmFZBtqruBMaU6Mdp/HG71mtV/fF49SVJkjQdTasgq46luy9grR/JSZKkaa6XVy2QJEmSxswgK0mSpFYyyEqSJKmVDLKSJElqJYOsJEmSWskgK0mSpFYyyEqSJKmVDLKSJElqJYOsJEmSWskgK0mSpFYyyEqSJKmVDLKSJElqJYOsJEmSWskgK0mSpFYyyEqSJKmVDLKSJElqpdm9LkDjb8PNW+g74YJelyFJk2bzyuW9LkFSDzgjK0mSpFYyyEqSJKmVDLKSJElqpRkdZJPcNUDbG5K8ahR99CXZOED73yR57vbWKEmSpIH5Za9+quqfx6mfd49HP5IkSRrYjJ6RHUiSE5Mc32yvTnJSkjVJbkzyjFH0c0aSw5vtdye5MsnGJKuSZKj+k8xPclaS65Kcl+SKJMsm4vlKkiS1lUF2eLOr6gDgOOA9Y+zj1Kp6alXtDewAvGiY/t8I3FFVS4B3AfuPcVxJkqRpyyA7vHOb23VA3xj7OLSZVd0APBvYa5j+DwY+C1BVG4FrhxsgyYoka5Os3Xr3ljGWKUmS1B4G2eHd29xuZQxripPMA04DDq+qpcDpwLzx6n+bqlpVVcuqatms+QvG2o0kSVJrGGQn3rbQeluSnYDDR3DMpcArAJIsAZZOUG2SJEmtNdOvWjA/yQ+77n94jP08qV8/b922UVU/S3I6sBG4FbhyBP2dBpyZ5DrgemAT4HoBSZKkLqmqXtegfpLMAuZU1T1J9gC+Cjypqu4byfFzFy2uRUefPJElStKUsnnl8l6XIGmMkqyrqjFdnWmmz8hOVfOBi5PMAQK8caQhVpIkaaYwyE5BVXUn4HVjJUmShuCXvSRJktRKzshOQ0t3X8Ba14tJkqRpzhlZSZIktZJBVpIkSa1kkJUkSVIrGWQlSZLUSgZZSZIktZJBVpIkSa1kkJUkSVIrGWQlSZLUSgZZSZIktZJBVpIkSa1kkJUkSVIrGWQlSZLUSgZZSZIktZJBVpIkSa1kkJUkSVIrze51ARp/G27eQt8JF/S6DEmaNJtXLu91CZJ6wBlZSZIktZJBVpIkSa1kkJUkSVIrGWQlSZLUSq0Iskm2JlmfZFOSa5K8LcmE1J6kL8nGZvuQJOf3e/z5TS3rk9yV5IZm+18noh5JkiQNrC1XLfhlVe0LkOSRwKeBnYH3jOTgJLOr6v7B7o9GVV0IXNj0sxo4vqrW9htvVlVtHUv/Q5mofiVJktqoFTOy3arqx8AK4M3p6EtySZKrmr+D4IHZ1EuSfBG4boD7s5J8MMmVSa5N8vrtqSvJ5iQnJbkKeHmS5yX5dlPT2Ul2SvKHSc7uOuaBGd8kRyXZkGRjkpO69rkryYeSXAMcuD01SpIkTSdtmZH9DVX1n0lmAY8Efgz8QVXdk2Qx8BlgWbPrfsDeVXVTkkP63V8BbKmqpyaZC1ya5CKgtqO0n1bVfkkWAucCz62qXyR5O/AXwPuBVUl2rKpfAEcAn03yKOAkYH/gDuCiJC+pqs8DOwJXVNXbhhq4eT4rAGbtvOt2PAVJkqR2aN2M7ADmAKcn2QCcDSzpemxNVd00yP3nAa9K
sh64AngEsHg7a/lcc/u0po5Lm/6PBh7XLGf4CvDiJLOB5cAXgKcCq6vqJ80+nwKe2fS1FThnuIGralVVLauqZbPmL9jOpyFJkjT1DTkj28x6nlRVx09SPSOS5Al0At6P6ayT/RGwD51gfk/Xrr/od2j3/QDHNmteu/vu247StvUf4D+q6qgB9vks8GbgdmBtVd2ZZKg+73FdrCRJ0m8bcka2CVAHT1ItI5JkV+CfgVOrqoAFwC1V9WvgT4FZI+zqQuB/JZnT9PvEJDuOU5mXA09P8rtN3zsmeWLz2DfoLHF4HZ1QC7AGeFaShc0/Ho5q9pMkSdIgRrJG9urmC1Jn0zWjWVXnTlhVv22H5iP6OcD9wCeBDzePnQack+RVdD627z8LO5h/AfqAq9KZEv0J8JLxKLaqfpLkGOAzzfpbgL8Gbqyqrc0XvI6hs+SAqrolyQnAxXRmcy+oqi+MRy2SJEnTVTqTmkPskHxigOaqqtdMTEnaXnMXLa5FR5/c6zIkadJsXrm81yVIGqMk66pq2fB7/rZhZ2Sr6tVj6ViSJEmaSMNetSDJo5Ocl+THzd85SR49GcVJkiRJgxnJ5bc+AXwReFTz96WmTZIkSeqZkayRXb/t52GHatPUsWzZslq7du3wO0qSJPXY9qyRHcmM7E+T/Enzk66zkvwJ8NOxDCZJkiSNl5EE2dcArwBuBW4BDgf8ApgkSZJ6aiRXLfg+8D8noRZJkiRpxIYNss0vab2Ozo8HPLC/15GVJElSL43kl72+AFwCfBXYOrHlSJIkSSMzkiA7v6rePuGVSJIkSaMwki97nZ/khRNeiSRJkjQKg87IJrkTKCDAO5LcC/yquV9VtfPklChJkiT9tkGDbFU9bDILkSRJkkZj2KUFSb42kjZJkiRpMg21tGAesCOwMMnD6SwpANgZ2H0SapMkSZIGNdRVC14PHAc8Criqq/3nwKkTWJMkSZI0rKHWyJ4CnJLk2Kr66CTWJEmSJA1rJNeR3ZLkVf0bq+pfJ6AeSZIkaURGEmSf2rU9D3gOnaUGBllJkiT1zLBBtqqO7b6fZBfgsxNVkLbfhpu30HfCBb0uQ5Jaa/PK5b0uQdIIjOSXvfr7BfCE8S5EkiRJGo1hZ2STfInOL3wBzAJ+DzhrIouSJEmShjOSNbL/0LV9P50we8TElCNJkiSNzEjWyH4jyVOAPwZeDtwEnDPRhUmSJElDGXSNbJInJnlPkuuBjwL/BaSqDq2qSf9BhCSV5ENd949PcuI4j3FikuN7XUuSLzdfqpMkSdIghvqy1/XAs4EXVdXBzY8ibJ2csgZ0L/BHSRb2sIZtJrSWqnphVf2suy0dY/lyniRJ0rQ0VDD6I+AW4OIkpyd5DpDJKWtA9wOrgLf2fyDJi5NckeTqJF9NsluShyTZ3D2zmeS7zWO/tX9Xd/sk+Xaz7+vGo5am/VlJ1jd/Vyd5WJJFSb7ZtG1M8oxm381JFibpS3JDkn8FNgKPGeNrJ0mSNO0MGmSr6vNVdSSwJ3AxcBzwyCQfS/K8Saqvv38CXplkQb/2bwFPq6qn0LnG7V9W1a+BLwAvBUjy+8D3q+pHA+3f1deT6cxEHwi8O8mjtreWpv144E1VtS/wDOCXdNYdX9i07QOsH2CcxcBpVbVXVX1/kFpIsiLJ2iRrt969ZbDdJEmSpo1hP6quql9U1aer6sXAo4GrgbdPeGUD1/JzOr8o9pZ+Dz0auDDJBuB/A3s17Z/jwSssHNncH2p/gC9U1S+r6jY6Af6AcarlUuDDSd4C7FJV9wNXAq9u1tcurao7Bxjq+1V1+UA19KtnVVUtq6pls+b3z9aSJEnTz6jWXFbVHU1ges5EFTQCJwOvBXbsavsocGpVLQVeT+endAG+Dfxukl2BlwDnDrM/PHjN3MHuj6mWqloJ/BmwA3Bpkj2r6pvAM4GbgTOSvGqAMX4xxPiSJEkzVuu+PFRVt9P5QYbXdjUvoBMGAY7u2reA84APA9+pqp8O
tX/jsCTzkjwCOITOrOl215Jkj6raUFUnNX3umeRxwI+q6nTgX4D9hnjqkiRJ6tK6INv4ENB9xYATgbOTrANu67fv54A/4cFlBcPtfy2dJQWXA++rqv8ep1qOa77QdS3wK+D/0gnK1yS5ms4SiFOGGUuSJEmNdCYtNZ3MXbS4Fh19cq/LkKTW2rxyea9LkGaMJOuqatlYjm3rjKwkSZJmOIOsJEmSWml2rwvQ+Fu6+wLW+rGYJEma5pyRlSRJUisZZCVJktRKBllJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKBllJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKBllJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKBllJkiS10uxeF6Dxt+HmLfSdcEGvy5AkNTavXN7rEqRpyRlZSZIktZJBVpIkSa1kkJUkSVIrGWQlSZLUSjMmyCZ5aZL1/f5+neQFSQ5Jcv4k17M6ybIR7rssyUcmuiZJkqQ2mTFXLaiq84Dztt1PsgJ4JXAh8Mxe1TUSVbUWWNvrOiRJkqaSGTMj2y3JE4F3A39aVb9umndK8u9Jrk/yqSRp9n13kiuTbEyyqqt9dZKTkqxJcmOSZzTt85OcleS6JOcluWIUM687Jvl40+fVSQ5r2id9xliSJGmqm3FBNskc4NPA26rqv7oeegpwHLAEeALw9Kb91Kp6alXtDewAvKjrmNlVdUBz3HuatjcCd1TVEuBdwP6jKO+dwNebPg8FPphkx1EcL0mSNGPMuCALvA/YVFWf69e+pqp+2MzQrgf6mvZDm1nVDcCzgb26jjm3uV3Xtf/BwGcBqmojcO0oansecEKS9cBqYB7w2JEcmGRFkrVJ1m69e8sohpQkSWqnGbNGFjof0QMvA/Yb4OF7u7a3ArOTzANOA5ZV1Q+SnEgnXPY/Zivj81oGeFlV3dCv7t2GO7CqVgGrAOYuWlzjUIskSdKUNmNmZJM8HPgE8KqqunOEh20Lrbcl2Qk4fATHXAq8ohlzCbB0FGVeCBzbtQ73KaM4VpIkaUaZSTOybwAeCXysyYnb/D3wo4EOqKqfJTkd2AjcClw5gnFOA85Mch1wPbAJGOyz/guS/KrZ/jbwKuBk4NokDwFu4jfX5EqSJKmRKj+FHk9JZgFzquqeJHsAXwWeVFX3TVYNcxctrkVHnzxZw0mShrF55fJelyBNWUnWVdWIrvDU30yakZ0s84GLm6sjBHjjZIZYSZKkmcIgO86a9bdj+leFJEmSRm7GfNlLkiRJ04szstPQ0t0XsNb1WJIkaZpzRlaSJEmtZJCVJElSKxlkJUmS1EoGWUmSJLWSQVaSJEmtZJCVJElSKxlkJUmS1EoGWUmSJLWSQVaSJEmtZJCVJElSKxlkJUmS1EoGWUmSJLWSQVaSJEmtZJCVJElSKxlkJUmS1Eqze12Axt+Gm7fQd8IFvS5DkjRONq9c3usSpCnJGVlJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKrQqySR6RZH3zd2uSm7vuP3QM/R2T5CdJrk7y3SQXJjlojLX1Jdk4lmMlSZI0eq26akFV/RTYFyDJicBdVfUP29nt56rqzU2fhwLnJjm0qr6znf1KkiRpArVqRnYgSV6X5Mok1yQ5J8n8pr0vydeTXJvka0keO1xfVXUxsApYMUzfuyU5r2m/pmsWd1aS05NsSnJRkh2a/fdI8pUk65JckmTPpv2MJB9JclmS/0xyeNP+kCSnJbk+yX8k+fK2xyRJktTR+iALnFtVT62qfYDvAK9t2j8KnFlVTwY+BXxkhP1dBew5TN8fAb7RtO8HbGraFwP/VFV7AT8DXta0rwKOrar9geOB07rGWwQcDLwIWNm0/RHQBywB/hQ4cLiik6xIsjbJ2q13bxnhU5UkSWqvVi0tGMTeSf4W2AXYCbiwaT+QTiAE+CTwgRH2lxH0/WzgVQBVtRXYkuThwE1Vtb7ZZx3Ql2Qn4CDg7OSBrud2jfH5
qvo1cF2S3Zq2g4Gzm/Zbk1w8XNFVtYpOYGbuosU1wucqSZLUWtMhyJ4BvKSqrklyDHDIdvb3FDqzr2Pp+96u7a3ADnRmvX9WVfuO4JgMso8kSZL6mQ5LCx4G3JJkDvDKrvbLgCOb7VcClwzXUZJn0Vkfe/owfX8N+F/NMbOSLBisz6r6OXBTkpc3+yfJPsOUcinwsmat7G5sfziXJEmadqZDkH0XcAWd8Hd9V/uxwKuTXEtnnemfD3L8Ec3lu24E3gG8rOuKBYP1/efAoUk20FlCsGSYGl8JvDbJNXTW0x42zP7nAD8ErgP+jc66XRe+SpIkdUmVyymnoiQ7VdVdSR4BrAGeXlW3juTYuYsW16KjT57Q+iRJk2fzyuW9LkGaMEnWVdWysRw7HdbITlfnJ9kFeCjwvpGGWEmSpJnCIDtFVdUhva5BkiRpKjPITkNLd1/AWj+GkiRJ09x0+LKXJEmSZiCDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWMshKkiSplWb3ugCNvw03b6HvhAt6XYYkSVPe5pXLe12CtoMzspIkSWolg6wkSZJaySArSZKkVmpVkE1ycZLn92s7LsnHxqHvA5KsTvLdJFcluSDJ0jH2dWKS40faLkmSpNFrVZAFPgMc2a/tyKZ9zJLsBpwFvKOqFlfVfsDfA3tsT7+SJEmaOG0Lsv8OLE/yUIAkfcCjgEuSfCzJ2iSbkrx32wFJNid5bzPLuiHJngP0+2bgzKq6bFtDVX2rqj6/bZwkX09ybZKvJXnsUO2jkeSQJOd33T81yTHN9lOTXJbkmiRrkjxstP1LkiRNV60KslV1O7AGeEHTdCRwVlUV8M6qWgY8GXhWkid3HXpbM8v6MWCgj/b3Aq4aYuiP0gm6TwY+BXxkmPbt1oT1zwF/XlX7AM8Ffjle/UuSJLVdq4Jso3t5QfeyglckuQq4mk4wXdJ1zLnN7Tqgb7gBklyR5DtJTmmaDgQ+3Wx/Ejh4mPbx8CTglqq6EqCqfl5V9w9R84pmRnrt1ru3jGMZkiRJU1Mbg+wXgOck2Q+YX1Xrkjyezkzrc5rZ0QuAeV3H3NvcbmXgH4HYBOy37U5V/T7wLmDBBNTf3/385nmYN9iOQ6mqVVW1rKqWzZo/GWVLkiT1VuuCbFXdBVwMfJwHZ2N3Bn4BbGm+uPWCQQ4fzD8BxyQ5qKttftf2ZTw4C/xK4JJh2kfj+8CSJHOT7AI8p2m/AViU5KkASR6WxF9ikyRJarQ1GH0GOI8mRFbVNUmuBq4HfgBcOprOqurWJEcAJyXZHfgxcBvwN80uxwKfSPK/gZ8Arx6mfSh/neS4rrEfneQsYCNwE52lEVTVfU1NH02yA531sc8F7hrNc5MkSZqu0vmelKaTuYsW16KjT+51GZIkTXmbVy7vdQkzXpJ1zRf2R611SwskSZIkMMhKkiSppQyykiRJaqW2ftlLQ1i6+wLWuuZHkiRNc87ISpIkqZUMspIkSWolg6wkSZJaySArSZKkVjLISpIkqZUMspIkSWolg6wkSZJaySArSZKkVjLISpIkqZUMspIkSWolg6wkSZJaySArSZKkVjLISpIkqZUMspIkSWolg6wkSZJaaXavC9D423DzFvpOuKDXZUiSpBbbvHJ5r0sYljOykiRJaiWDrCRJklrJICtJkqRWMshKkiSplaZUkE3yziSbklybZH2S32/aNydZOAHjzUmyMsl3k1yV5NtJXtA8dtc4jjMh9UuSJM1kU+aqBUkOBF4E7FdV9zbB76ETPOz7gEXA3s2YuwHPmuAxJUmSNA6m0ozsIuC2qroXoKpuq6r/7nr82GbWdEOSPQGS7Jjk40nWJLk6yWFN+zFJzk3ylWa29QP9B0syH3gdcGzXmD+qqrO69vm7JNckubwJuSQ5
I8nhXfvc1dwekmR1kn9Pcn2STyXJMPWfmOT4rr42Julrtj+fZF0zQ71ie15YSZKk6WgqBdmLgMckuTHJaUn6z4zeVlX7AR8DtoW/dwJfr6oDgEOBDybZsXlsX+AIYClwRJLH9Ovvd4H/qqqfD1LPjsDlVbUP8E06oXc4TwGOA5YATwCePkz9Q3lNVe0PLAPekuQRQ+2cZEWStUnWbr17ywi6lyRJarcpE2Sr6i5gf2AF8BPgc0mO6drl3OZ2HdDXbD8POCHJemA1MA94bPPY16pqS1XdA1wHPG6UJd0HnD/AmENZU1U/rKpfA+v7HTNQ/UN5S5JrgMuBxwCLh9q5qlZV1bKqWjZr/oIRdC9JktRuU2aNLEBVbaUTSFcn2QAcDZzRPHxvc7uVB+sO8LKquqG7n+ZLYvd2NXUfs833gMcm2XmQWdlfVVUNcPz9NP8ASPIQfnMd71BjDlT/A3015jX9HgI8Fziwqu5OsnrbY5IkSeqYMjOySZ6UpHvWcV/g+8McdiGdtadp+njKSMerqruB/wOckuShzfG7Jnn5MIdupjNzDPA/gTkjHXOQvvZrxt4PeHzTvgC4owmxewJP244xJEmSpqUpE2SBnYAzk1yX5Fo660xPHOaY99EJktcm2dTcH42/prOM4bokG+ksJRhszew2pwPPaj72PxD4xSjH7HYO8DtN7W8GbmzavwLMTvIdYCWd5QWSJEnqkgc/Pdd0MXfR4lp09Mm9LkOSJLXY5pXLJ2WcJOuqatlYjp1KM7KSJEnSiBlkJUmS1EoGWUmSJLXSlLr8lsbH0t0XsHaS1rVIkiT1ijOykiRJaiWDrCRJklrJICtJkqRWMshKkiSplQyykiRJaiWDrCRJklrJICtJkqRWSlX1ugaNsyR3Ajf0ug6N2ULgtl4XoTHx3LWX5669PHftte3cPa6qdh1LB/4gwvR0Q1Ut63URGpskaz1/7eS5ay/PXXt57tprPM6dSwskSZLUSgZZSZIktZJBdnpa1esCtF08f+3luWsvz117ee7aa7vPnV/2kiRJUis5IytJkqRWMshOM0n+MMkNSb6X5IRe16OhJdmcZEOS9UnWNm2/k+Q/kny3uX14r+sUJPl4kh8n2djVNuC5SsdHmvfhtUn2613lgkHP34lJbm7ef+uTvLDrsb9qzt8NSZ7fm6qV5DFJLk5yXZJNSf68afe91wJDnL9xe+8ZZKeRJLOAfwJeACwBjkqypLdVaQQOrap9uy5BcgLwtapaDHytua/eOwP4w35tg52rFwCLm78VwMcmqUYN7gx++/wB/GPz/tu3qr4M0Px380hgr+aY05r/vmry3Q+8raqWAE8D3tScH9977TDY+YNxeu8ZZKeXA4DvVdV/VtV9wGeBw3pck0bvMODMZvtM4CW9K0XbVNU3gdv7NQ92rg4D/rU6Lgd2SbJoUgrVgAY5f4M5DPhsVd1bVTcB36Pz31dNsqq6paquarbvBL4D7I7vvVYY4vwNZtTvPYPs9LI78IOu+z9k6P/BqPcKuCjJuiQrmrbdquqWZvtWYLfelKYRGOxc+V5sjzc3H0F/vGsZj+dvCkrSBzwFuALfe63T7/zBOL33DLJSbx1cVfvR+TjsTUme2f1gdS4r4qVFWsBz1UofA/YA9gVuAT7U02o0qCQ7AecAx1XVz7sf87039Q1w/sbtvWeQnV5uBh7Tdf/RTZumqKq6ubn9MXAenY9QfrTto7Dm9se9q1DDGOxc+V5sgar6UVVtrapfA6fz4EeYnr8pJMkcOiHoU1V1btPse68lBjp/4/neM8hOL1cCi5M8PslD6SyY/mKPa9IgkuyY5GHbtoHnARvpnLOjm92OBr7Qmwo1AoOdqy8Cr2q+Qf00YEvXx6CaIvqtnXwpnfcfdM7fkUnmJnk8nS8OrZns+tS5CgHwf4DvVNWHux7yvdcCg52/8XzvzR7fktVLVXV/kjcDFwKzgI9X1aYel6XB7Qac13mfMxv4dFV9JcmVwFlJ
Xgt8H3hFD2tUI8lngEOAhUl+CLwHWMnA5+rLwAvpfFHhbuDVk16wfsMg5++QJPvS+Vh6M/B6gKralOQs4Do637p+U1Vt7UHZgqcDfwpsSLK+aXsHvvfaYrDzd9R4vff8ZS9JkiS1kksLJEmS1EoGWUmSJLWSQVaSJEmtZJCVJElSKxlkJUmS1EoGWUmagZL8jySfTfL/mp9I/nKSJ45j/4ckOWi8+pOkgRhkJWmGaS5Sfh6wuqr2qKr9gb/iwd+rHw+HAAZZSRPKICtJM8+hwK+q6p+3NVTVNcC3knwwycYkG5IcAQ/Mrp6/bd8kpyY5ptnenOS9Sa5qjtkzSR/wBuCtSdYnecZkPjlJM4e/7CVJM8/ewLoB2v8I2BfYB1gIXJnkmyPo77aq2i/JG4Hjq+rPkvwzcFdV/cN4FS1J/TkjK0na5mDgM1W1tap+BHwDeOoIjju3uV0H9E1QbZL0WwyykjTzbAL2H8X+9/Ob/38xr9/j9za3W/GTPkmTyCArSTPP14G5SVZsa0jyZOBnwBFJZiXZFXgmsAb4PrAkydwkuwDPGcEYdwIPG+/CJamb/3KWpBmmqirJS4GTk7wduAfYDBwH7ARcAxTwl1V1K0CSs4CNwE3A1SMY5kvAvyc5DDi2qi4Z7+chSamqXtcgSZIkjZpLCyRJktRKBllJkiS1kkFWkiRJrWSQlSRJUisZZCVJktRKBllJkiS1kkFWkiRJrWSQlSRJUiv9f1QBMP3t4YJ1AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 720x432 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Join the parts of each author's name into one string\n",
    "authors_names = [' '.join(x) for x in all_authors]\n",
    "authors_names = pd.DataFrame(authors_names)\n",
    "\n",
    "# Draw a horizontal bar chart of the ten most frequent names\n",
    "plt.figure(figsize=(10, 6))\n",
    "authors_names[0].value_counts().head(10).plot(kind='barh')\n",
    "\n",
    "# Adjust the plot: tick labels and axis labels\n",
    "names = authors_names[0].value_counts().index.values[:10]\n",
    "_ = plt.yticks(range(0, len(names)), names)\n",
    "plt.ylabel('Author')\n",
    "plt.xlabel('Count')"
   ]
  },
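  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The counting step can also be written with the standard library as a cross-check on what `value_counts().head(10)` computes. The sketch below runs on made-up data; `all_authors_demo` is a hypothetical stand-in for `all_authors`. It also strips the trailing space that `' '.join(...)` leaves behind when the last name part is empty, which the plotting cell above does not do.\n",
    "\n",
    "```python\n",
    "from collections import Counter\n",
    "\n",
    "# Made-up parsed authors, each stored as a list of name parts\n",
    "all_authors_demo = [\n",
    "    ['Smith', 'Anna', ''],\n",
    "    ['Lee', 'Bo', ''],\n",
    "    ['Smith', 'Anna', ''],\n",
    "]\n",
    "\n",
    "# Join the parts and drop the trailing space left by an empty last part\n",
    "full_names = [' '.join(parts).strip() for parts in all_authors_demo]\n",
    "\n",
    "# most_common(n) returns the n most frequent items with their counts\n",
    "top = Counter(full_names).most_common(2)\n",
    "print(top)\n",
    "```\n",
    "\n",
    "On the real data, `most_common(10)` would play the role of `value_counts().head(10)`."
   ]
  },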
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we count surnames, i.e. the first element of each author's entry in the `authors_parsed` field:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 0, 'Count')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnIAAAFzCAYAAAC6muStAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAdrklEQVR4nO3df7RdZX3n8ffHBBMjErBBS2PrFQdbgViBqyOIFqS1tWF0rFSdsUXsj9Q67ZRaR+PUWp2utrFiq6NVVrAVuyoVpWIVZupPtFYUuJFgAoLFGpVUdCiS8qOChO/8cfaVY8yPm9x7zr7Pve/XWmfdvZ+99znf/azF4ZNn7+fsVBWSJElqzwP6LkCSJEkHxiAnSZLUKIOcJElSowxykiRJjTLISZIkNcogJ0mS1KilfRfQh1WrVtXExETfZUiSJO3Tpk2bbqmqw3e3bVEGuYmJCaampvouQ5IkaZ+SfGVP27y0KkmS1CiDnCRJUqMMcpIkSY0yyEmSJDXKICdJktQog5wkSVKjFuXPj2zZvoOJ9Zf2XYbGYNuGtX2XIEnSyDgiJ0mS1CiDnCRJUqMMcpIkSY0yyEmSJDWqlyCX5NlJNu/yui/JM5Jc0kdNkiRJrell1mpVXQxcPL2eZB3wAuDuPuqRJElqUe+XVpM8Bng18IvAfcDBSS5Kcn2SdyVJt99pSa5OsiXJXyZZ1rVvS7KqW55M8omeTkWSJGmseg1ySQ4CLgB+p6q+2jUfB5wNHA0cCTw5yXLgfOB5VbWGwUjir+/nZ61LMpVkauddO+boDCRJkvrT94jcHwDXVtWFQ21XVtVNVXUfsBmYAH4U+HJVfbHb553AU/fng6pqY1VNVtXkkhUrZ1+5JElSz3p7skOSU4DnAMfvsmn4Prmd7LvGe7k/kC6fi9okSZJa0Nes1cOAdwBnVtXtMzjkBmAiyX/o1n8R+GS3vA04oVt+zlzWKUmSNJ/1dWn1xcDDgLcN/wQJ8PDd7VxV3wZeBLw3yRYGkyLO7Ta/FnhTkikGI3iSJEmLQqqq7xrGbtkRR9URL3xj32VoDLZtWNt3CZIkzUqSTVU1ubttfU92kCRJ0gEyyEmSJDWqt1mrfVqzeiVTXnKTJEmNc0ROkiSpUQY5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGrW07wL6sGX7DibWX9p3GVrgtm1Y23cJkqQFzhE5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEjC3JJ7thl/awkbxnV50mSJC02jshJkiQ1qpcgl+T8JGcMrd/R/T04yceSfC7JliTP6tonknwhyXlJrk3y4SQP6rY9Icnnk2xO8vokW/s4J0mSpHEbZZB7UBeuNifZDPyvGRzzbeDZVXU8cCrwhiTpth0F/HlVHQPcBjyna38H8GtV9Xhg5xzWL0mSNK+N8geB/70LV8DgHjlgch/HBPijJE8F7gNWAw/vtn25qjZ3y5uAiSSHAg+pqs907RcAp+/2jZN1wDqAJYccvp+nIkmSNP/0dY/cvdOfneQBwAO79hcAhwMndCHwG8DybtvdQ8fvZD9DaFVtrKrJqppcsmLlLEqXJEmaH/oKctuAE7rlZwIHdcsrgW9W1XeSnAo8cm9vUlW3Abcn+Y9d0/PnvlRJkqT5qa8gdx7wE0muAU4E7uza3wVMJtkCnAlcP4P3+mXgvO4+vAcDO+a+XEmSpPlnZPfIVdXBu6yfD5zfLX8DeNLQ5ld07bcwCHa7c+zQe50z1H5tVT0OIMl6YGqWpUuSJDVhlJMdxmVtklcyOJevAGf1W44kSdJ4NB/kqupC4MK+65AkSRo3n+wgSZLUqOZH5A7EmtUrmdqwtu8yJEmSZsUROUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQk
SZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWrU0r4L6MOW7TuYWH9p32VIbNuwtu8SJEkNc0ROkiSpUQY5SZKkRhnkJEmSGmWQkyRJatS8DXJJ/izJ2UPrH0ry9qH1NyR5aS/FSZIkzQPzNsgBnwZOAkjyAGAVcMzQ9pOAy3uoS5IkaV6Yz0HucuDEbvkYYCtwe5LDkiwDHgv89yRnTB+Q5I7xlylJktSPefs7clX1L0nuTfIjDEbfPgOsZhDudgBbgHtm+n5J1gHrAJYccvjcFyxJkjRm83lEDgajcidxf5D7zND6p/fnjapqY1VNVtXkkhUr57xQSZKkcZvvQW76Prk1DC6tfpbBiNz0/XH30p1Ddx/dA/spU5Ikafzme5C7HDgduLWqdlbVrcChDMLc5cA24IRu32cCB/VQoyRJUi/me5DbwmC26md3adtRVbcA5wE/keQaBuHuzvGXKEmS1I95O9kBoKp2Aofs0nbW0PI3gCcNbX7FeCqTJEnq33wfkZMkSdIeGOQkSZIaNa8vrY7KmtUrmdqwtu8yJEmSZsUROUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRS/suoA9btu9gYv2lfZchAbBtw9q+S5AkNcoROUmSpEYZ5CRJkhplkJMkSWpUE0EuyQ8n+XKSh3brh3XrEz2XJkmS1JsmglxVfQ14G7Cha9oAbKyqbb0VJUmS1LMmglznz4AnJTkbOBk4J8kpSS6Z3iHJW5Kc1VN9kiRJY9XMz49U1XeS/A/g74Gnd+t9lyVJktSblkbkAJ4BfB04dn8PTLIuyVSSqZ137Zj7yiRJksasmSCX5PHATwFPAn47yRHAvXzvOSzf0/FVtbGqJqtqcsmKlSOtVZIkaRyaCHIZXEN9G3B2VX0VeD1wDvAV4Ogky5IcCpzWX5WSJEnj1USQA34V+GpVfaRbfyvwWOBI4D3A1u7v1f2UJ0mSNH5NTHaoqo3AxqH1ncDx3eongZf3UZckSVKfWhmRkyRJ0i4McpIkSY0yyEmSJDWqiXvk5tqa1SuZ2rC27zIkSZJmxRE5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGmWQkyRJatTSvgvow5btO5hYf2nfZUiL3rYNa/suQZKa5oicJElSowxykiRJjTLISZIkNcogJ0mS1KjeglwG/jHJM4bafj7J3/dVkyRJUkt6m7VaVZXkxcB7k1zW1fJHwM/0VZMkSVJLev35karamuSDwCuABwN/Dfx1kuXAvwMvqqobkpwFPBNYATwauLiqXg6Q5Je7428DrgHurqrfGPe5SJIkjdt8+B251wKfA+4BTgb+sKruTfKTDEbontPt93jgOOBu4IYkbwZ2Ar8HHA/cDnycQZj7PknWAesAlhxy+KjORZIkaWx6D3JVdWeSC4E7gEOA85McBRRw0NCuH6uqHQBJrgMeCawCPllVt3bt7wUes4fP2QhsBFh2xFE1otORJEkam71OdkiyJMk5Y6jjvu71B8BlVXUs8J+A5UP73D20vJN5EEIlSZL6tNcgV1U7GVzuHJeVwPZu+awZ7H8V8BNJDkuylPsvw0qSJC14MxnVujrJB4D3AndON1bV+0ZQz58A70zyKmCfD0Otqu1J/gi4ErgVuB7YMYK6JEmS5p2ZBLnlwL8CTxtqK2DOglxVvWZodfget1d1288Hzh/a//ShfS6oqo3diNzFwPvnqi5JkqT5bJ9BrqpeNI5CZuE13QzX5cCHMchJkqRFYp9BLskjgDcDT+6aPgX8VlXdNMrC
ZqqqXtZ3DZIkSX2YySO63gF8APih7vXBrk2SJEk9StXef1Ityeaqevy+2loyOTlZU1NTfZchSZK0T0k2VdXk7rbNZETuX5P8QvebckuS/AKDyQ+SJEnq0UyC3C8BzwVuBr4OnAHM9wkQkiRJC95MZq1+hcED6yVJkjSPzGTW6uHArwITw/tX1S+NrixJkiTty0x+EPjvGPzkyEcZPONUkiRJ88BMgtyKqnrFyCuRJEnSfpnJZIdLkvzsyCuRJEnSftnjiFyS2xk8UzXA/0xyN/Cdbr2q6pDxlChJkqTd2WOQq6qHjLMQSZIk7Z99XlpN8rGZtEmSJGm89nZpdTnwYGBVksMYXFIFOARYPYbaJEmStBd7m7X6a8DZwA8Bnxtq/zfgLSOsSZIkSTOwt3vk3gS8KclvVtWbx1iTJEmSZmAmvyO3I8mZuzZW1V+NoB5JkiTN0EyC3BOGlpcDpzG41GqQkyRJ6tE+g1xV/ebwepJDgXePqqBx2LJ9BxPrL+27DEnaq20b1vZdgqR5biZPdtjVncCRc12IJEmS9s8+R+SSfJDBEx4AlgCPBd4zyqIkSZK0bzO5R+6coeV7GYS5542mHEmSJM3UTO6R+2SS44D/Cvw88GXgb0ddmCRJkvZub092eAzwX7rXLcCFQKrq1Ln44CQ/CLyRwazY24BvAO8HnllVp8/FZ0iSJC1kexuRux74FHB6Vd0IkOS35+JDkwS4GHhnVT2/a/tx4Jlz8f6SJEmLwd5mrf4c8HXgsiTnJTmN+5+3OlunAt+pqnOnG6rqGgbB8eAkFyW5Psm7utBHkhOSfDLJpiQfSnJE1/6JJK9LcmWSLyZ5yhzVKEmSNK/tMchV1fu70bIfAy5j8NzVhyV5W5Knz/JzjwU27WHbcd1nHc3gZ06enOQg4M3AGVV1AvCXwB8OHbO0qp7YHff7u3vTJOuSTCWZ2nnXjlmWL0mS1L+ZTHa4E7gAuCDJYQwmPLwC+PCIarqyqm4CSLIZmGBwD92xwEe6AbolDEYLp72v+7up2//7VNVGYCPAsiOOqt3tI0mS1JKZ/PzId1XVtxiEoY2z/NxrgTP2sO3uoeWdDGoMcG1VnbiPY6b3lyRJWvAO5MkOc+HjwLIk66YbkjwO2NP9bTcAhyc5sdv3oCTHjL5MSZKk+auXIFdVBTwb+MkkX0pyLfDHwM172P8eBiN4r0tyDbAZOGlM5UqSJM1LvV2GrKp/AZ67m03nDe3zG0PLm4Gn7uZ9ThlavoU93CMnSZK00PR1aVWSJEmzZJCTJElq1KKc4blm9UqmNqztuwxJkqRZcUROkiSpUQY5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGrW07wL6sGX7DibWX9p3GZK0KGzbsLbvEqQFyxE5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVHzPsgluWM3bS9OcmYf9UiSJM0XTc5arapz+65BkiSpb/N+RG53krwmycu65U8kmeyWVyXZ1mtxkiRJY9JkkJMkSdIiCnJJ1iWZSjK1864dfZcjSZI0awshyN3L/eexfE87VdXGqpqsqsklK1aOpzJJkqQRWghBbhtwQrd8Ro91SJIkjVULQW5FkpuGXi/dZfs5wK8nuRpY1UN9kiRJvZj3Pz9SVXsNm1V1PfC4oaZXjbYiSZKk+aGFETlJkiTthkFOkiSpUQY5SZKkRs37e+RGYc3qlUxtWNt3GZIkSbPiiJwkSVKjDHKSJEmNMshJkiQ1yiAnSZLUKIOcJElSowxykiRJjTLISZIkNcogJ0mS1CiDnCRJUqMMcpIkSY0yyEmSJDXKICdJktQog5wkSVKjDHKSJEmNMshJkiQ1amnfBfRhy/YdTKy/tO8yJEmaM9s2rO27BPXAETlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUU0H
uSR37KbtxUnO7KMeSZKkcVpws1ar6ty+a5AkSRqHpkfkdifJa5K8rO86JEmSRm3BBbk9SbIuyVSSqZ137ei7HEmSpFlbNEGuqjZW1WRVTS5ZsbLvciRJkmZt0QQ5SZKkhcYgJ0mS1KjWZ62uSHLT0Pqf9laJJEnSmDUd5KrKEUVJkrRoGYQkSZIaZZCTJElqVNOXVg/UmtUrmdqwtu8yJEmSZsUROUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRS/suoA9btu9gYv2lfZchSdLIbduwtu8SNEKOyEmSJDXKICdJktQog5wkSVKjRhbkkjw7yeZdXvcleUaSS0b1uZIkSYvFyCY7VNXFwMXT60nWAS8A7h7VZ0qSJC0mY7m0muQxwKuBXwTuAw5OclGS65O8K0m6/V6d5KokW5NsHGr/RJLXJbkyyReTPKVrX5HkPUmuS3JxkiuSTI7jnCRJkvo28iCX5CDgAuB3quqrXfNxwNnA0cCRwJO79rdU1ROq6ljgQcDpQ2+1tKqe2B33+13bS4BvVdXRwO8BJ4zwVCRJkuaVcYzI/QFwbVVdONR2ZVXdVFX3AZuBia791G5UbQvwNOCYoWPe1/3dNLT/ycC7AapqK/D5PRWRZF2SqSRTO+/aMbszkiRJmgdG+oPASU4BngMcv8um4fvkdgJLkywH3gpMVtXXkrwGWL6bY3ZyAHVX1UZgI8CyI46q/T1ekiRpvhnlrNXDgHcAZ1bV7TM4ZDq03ZLkYOCMGRzzaeC53ecdDaw5kFolSZJaNMoRuRcDDwPe1s1ZmPbHu9u5qm5Lch6wFbgZuGoGn/FW4J1JrgOuB64FvG4qSZIWhVS1e5UxyRLgoKr6dpJHAx8FfrSq7tnbccuOOKqOeOEbx1GiJEm98lmr7Uuyqap2+6scI71HbgxWAJd1M2MDvGRfIU6SJGmhaDrIdffe+btxkiRpUfJZq5IkSY1qekTuQK1ZvZIp7xmQJEmNc0ROkiSpUQY5SZKkRhnkJEmSGmWQkyRJapRBTpIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElqlEFOkiSpUQY5SZKkRhnkJEmSGrW07wL6sGX7DibWX9p3GZIkqVHbNqztuwTAETlJkqRmGeQkSZIaZZCTJElqlEFOkiSpUWMLckn+LMnZQ+sfSvL2ofU3JHnpuOqRJElq3ThH5D4NnASQ5AHAKuCYoe0nAZePsR5JkqSmjTPIXQ6c2C0fA2wFbk9yWJJlwGOBpye5KsnWJBuTBCDJJ5K8LsmVSb6Y5Cld+4ok70lyXZKLk1yRZHKM5yRJktSbsQW5qvoX4N4kP8Jg9O0zwBUMwt0ksAV4S1U9oaqOBR4EnD70Fkur6onA2cDvd20vAb5VVUcDvwecsKfPT7IuyVSSqZ137Zjbk5MkSerBuCc7XM4gxE0Huc8MrX8aOLUbVdsCPI3vvfT6vu7vJmCiWz4ZeDdAVW0FPr+nD66qjVU1WVWTS1asnLMTkiRJ6su4g9z0fXJrGFxa/SyDEbnp++PeCpxRVWuA84DlQ8fe3f3dySJ9IoUkSdKwPkbkTgduraqdVXUrcCiDMDc90eGWJAcDZ8zg/T4NPBcgydEMAqIkSdKiMO6RrS0MZqtesEvbwVV1S5LzGIzU3QxcNYP3eyvwziTXAdcD1wLeACdJkhaFsQa5qtoJHLJL21lDy68CXrWb404ZWr6F+++R+zbwC1X17SSPBj4KfGWu65YkSZqPWr/XbAVwWZKDgAAvqap7eq5JkiRpLJoOclV1O4OfLpEkSVp0fNaqJElSo5oekTtQa1avZGrD2r7LkCRJmhVH5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElq
lEFOkiSpUamqvmsYuyS3Azf0XccCtQq4pe8iFiD7dXTs29GwX0fHvh2d+dq3j6yqw3e3YVH+IDBwQ1X5aK8RSDJl3849+3V07NvRsF9Hx74dnRb71kurkiRJjTLISZIkNWqxBrmNfRewgNm3o2G/jo59Oxr26+jYt6PTXN8uyskOkiRJC8FiHZGTJElq3qIKckl+JskNSW5Msr7velqQ5C+TfDPJ1qG2hyb5SJJ/6v4e1rUnyf/u+vfzSY4fOuaF3f7/lOSFfZzLfJLkh5NcluS6JNcm+a2u3b6dpSTLk1yZ5Jqub1/btT8qyRVdH16Y5IFd+7Ju/cZu+8TQe72ya78hyU/3dErzSpIlSa5Ockm3br/OgSTbkmxJsjnJVNfm98EcSHJokouSXJ/kC0lOXFB9W1WL4gUsAb4EHAk8ELgGOLrvuub7C3gqcDywdajtT4D13fJ64HXd8s8C/xcI8CTgiq79ocA/d38P65YP6/vceu7XI4Dju+WHAF8EjrZv56RvAxzcLR8EXNH12XuA53ft5wK/3i2/BDi3W34+cGG3fHT3PbEMeFT3/bGk7/Pr+wW8FLgAuKRbt1/npl+3Aat2afP7YG769p3Ar3TLDwQOXUh9u5hG5J4I3FhV/1xV9wDvBp7Vc03zXlX9A3DrLs3PYvAfBt3f/zzU/lc18Fng0CRHAD8NfKSqbq2qbwEfAX5m5MXPY1X19ar6XLd8O/AFYDX27ax1fXRHt3pQ9yrgacBFXfuufTvd5xcBpyVJ1/7uqrq7qr4M3Mjge2TRSvIIYC3w9m492K+j5PfBLCVZyWBA4i8AquqeqrqNBdS3iynIrQa+NrR+U9em/ffwqvp6t3wz8PBueU99bN/vRXfJ6TgGI0f27RzoLv9tBr7J4Av3S8BtVXVvt8twP323D7vtO4AfwL7dnTcCLwfu69Z/APt1rhTw4SSbkqzr2vw+mL1HAf8PeEd3S8DbkzyYBdS3iynIaQRqMObs1OcDlORg4G+Bs6vq34a32bcHrqp2VtXjgUcwGO35sX4ral+S04FvVtWmvmtZoE6uquOBZwD/LclThzf6fXDAljK4PehtVXUccCeDS6nf1XrfLqYgtx344aH1R3Rt2n/f6Iaa6f5+s2vfUx/b97uR5CAGIe5dVfW+rtm+nUPdJZTLgBMZXCKZfizhcD99tw+77SuBf8W+3dWTgWcm2cbg1pSnAW/Cfp0TVbW9+/tN4GIG/wDx+2D2bgJuqqoruvWLGAS7BdO3iynIXQUc1c2weiCDm28/0HNNrfoAMD1j54XA3w21n9nN+nkSsKMbuv4Q8PQkh3Uzg57etS1a3b1CfwF8oar+dGiTfTtLSQ5Pcmi3/CDgpxjcg3gZcEa32659O93nZwAf7/6F/gHg+d3sy0cBRwFXjuUk5qGqemVVPaKqJhh8f368ql6A/TprSR6c5CHTywz+O96K3wezVlU3A19L8qNd02nAdSykvu17tsU4Xwxmo3yRwf0yv9t3PS28gL8Bvg58h8G/bH6ZwX0uHwP+Cfgo8NBu3wB/3vXvFmBy6H1+icFNzTcCL+r7vPp+ASczGMr/PLC5e/2sfTsnffs44Oqub7cCr+7aj2QQGG4E3gss69qXd+s3dtuPHHqv3+36/AbgGX2f23x5Aadw/6xV+3X2/Xkkg5m81wDXTv//ye+DOevfxwNT3XfC+xnMOl0wfeuTHSRJkhq1mC6tSpIkLSgGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJ2o0kP5jk3Um+1D026f8kecwcvv8pSU6aq/eTtDgZ5CRpF90PNl8MfKKqHl1VJwCv5P7nMc6FUwCDnKRZMchJ0vc7FfhOVZ073VBV1wD/mOT1SbYm2ZLkefDd0bVLpvdN8pYkZ3XL25K8NsnnumN+LMkE8GLgt5NsTvKUcZ6cpIVj6b53kaRF51hgdw+H/zkGvxL/48Aq4Kok/zCD97ulqo5P8hLgZVX1K0nOBe6oqnPm
qmhJi48jcpI0cycDf1NVO6vqG8AngSfM4Lj3dX83ARMjqk3SImSQk6Tvdy1wwn7sfy/f+326fJftd3d/d+KVEElzyCAnSd/v48CyJOumG5I8DrgNeF6SJUkOB57K4IHwXwGOTrIsyaHAaTP4jNuBh8x14ZIWF/9lKEm7qKpK8mzgjUleAXwb2AacDRwMXAMU8PKquhkgyXuArcCXgatn8DEfBC5K8izgN6vqU3N9HpIWvlRV3zVIkiTpAHhpVZIkqVEGOUmSpEYZ5CRJkhplkJMkSWqUQU6SJKlRBjlJkqRGGeQkSZIaZZCTJElq1P8HvkGFcNubpKoAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 720x432 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Take the first element of each parsed author entry as the surname\n",
    "authors_lastnames = [x[0] for x in all_authors]\n",
    "authors_lastnames = pd.DataFrame(authors_lastnames)\n",
    "\n",
    "# Draw a horizontal bar chart of the ten most frequent surnames\n",
    "plt.figure(figsize=(10, 6))\n",
    "authors_lastnames[0].value_counts().head(10).plot(kind='barh')\n",
    "\n",
    "# Adjust the plot: tick labels and axis labels\n",
    "names = authors_lastnames[0].value_counts().index.values[:10]\n",
    "_ = plt.yticks(range(0, len(names)), names)\n",
    "plt.ylabel('Author')\n",
    "plt.xlabel('Count')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
